300 research outputs found

    Web services and workflow management for biological resources

    BACKGROUND: The completion of the Human Genome Project has resulted in large quantities of biological data which are proving difficult to manage and integrate effectively. There is a need for a system that is able to automate access to remote sites and to "understand" the information that it is managing in order to link data properly. Workflow management systems combined with Web Services are promising Information and Communication Technologies (ICT) tools. Some have already been proposed and are being increasingly applied to the biomedical domain, especially as many biology-related Web Services are now becoming available. Information on biological resources and on genomic sequence mutations are two examples of very specialized datasets that are useful for specific research domains. RESULTS: The architecture of a system that is able to access and execute predefined workflows is presented in this paper. Web Services allowing access to the IARC TP53 Mutation Database and CABRI catalogues of biological resources have been implemented and are available on-line. Example workflows which retrieve data from these Web Services have also been created and are available on-line. CONCLUSION: We present a general architecture and some building blocks for the implementation of a system that is able to remotely execute workflows of biomedical interest and show how this approach can effectively produce useful outputs. The further development and implementation of Web Services allowing access to an exhaustive set of biomedical databases and the creation of effective and useful workflows will improve the automation of in-silico analysis.

    BITS 2015: The annual meeting of the Italian Society of Bioinformatics

    This preface introduces the content of the BioMed Central journal Supplements related to the BITS 2015 meeting, held in Milan, Italy, from the 3rd to the 5th of June, 2015.

    In silico saturation mutagenesis and docking screening for the analysis of protein-ligand interaction: the Endothelial Protein C Receptor case study

    BACKGROUND: The design of mutants in protein functional regions, such as the ligand binding sites, is a powerful approach to recognize the determinants of specific protein activities in cellular pathways. For an exhaustive analysis of selected positions of a protein structure, large-scale mutagenesis techniques are often employed, with a laborious and time-consuming experimental set-up. 'In silico' mutagenesis and screening simulation represent a valid alternative to laboratory methods to drive the 'in vivo' testing toward more focused objectives. RESULTS: We present here a high performance computational procedure for large-scale mutant modelling and subsequent evaluation of the effect on ligand binding affinity. The mutagenesis was performed with a 'saturation' approach, where all 20 natural amino acids were tested in positions involved in ligand binding sites. Each modelled mutant was subjected to molecular docking simulation and stability evaluation. The simulated protein-ligand complexes were screened for their impairment of binding ability based on the change of calculated Ki compared to the wild-type. An example of application to the Endothelial Protein C Receptor residues involved in lipid binding is reported. CONCLUSION: The computational pipeline presented in this work is a useful tool for the design of structurally stable mutants with altered affinity for ligand binding, considerably reducing the number of mutants to be experimentally tested. The saturation mutagenesis procedure does not require previous knowledge of the functional role of the residues involved and allows extensive exploration of all possible substitutions and their pairwise combinations. Mutants are screened by docking simulation and stability evaluation followed by a rationally driven selection of those presenting the required characteristics. The method can be employed in molecular recognition studies and as a preliminary approach to select models for experimental testing.
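
    The enumeration step of a saturation approach like the one described above can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the function names, the toy sequence, and the chosen positions are hypothetical, and the wild-type residue is skipped at each position on the assumption that an identical substitution is not a mutant. The modelling, docking, and stability steps are not shown.

```python
from itertools import combinations, product

# The 20 natural amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(sequence, positions):
    """Yield (label, mutant) for every non-wild-type substitution.

    `positions` are 0-based indices; labels use 1-based numbering (e.g. K2A).
    """
    for pos in positions:
        wild_type = sequence[pos]
        for aa in AMINO_ACIDS:
            if aa == wild_type:
                continue  # assumption: the wild-type residue is not a mutant
            mutant = sequence[:pos] + aa + sequence[pos + 1:]
            yield f"{wild_type}{pos + 1}{aa}", mutant


def double_mutants(sequence, positions):
    """Yield (label, mutant) for every pairwise combination of substitutions."""
    for p1, p2 in combinations(sorted(positions), 2):
        for a1, a2 in product(AMINO_ACIDS, repeat=2):
            if a1 == sequence[p1] or a2 == sequence[p2]:
                continue  # both positions must differ from the wild type
            residues = list(sequence)
            residues[p1], residues[p2] = a1, a2
            label = f"{sequence[p1]}{p1 + 1}{a1}/{sequence[p2]}{p2 + 1}{a2}"
            yield label, "".join(residues)


# Hypothetical toy sequence and binding-site positions, for illustration only.
toy_sequence = "MKTAYIAK"
toy_positions = [1, 3]

singles = list(single_mutants(toy_sequence, toy_positions))
doubles = list(double_mutants(toy_sequence, toy_positions))
print(len(singles), "single mutants;", len(doubles), "double mutants")
```

    With n selected positions this yields 19n single mutants and 361 double mutants per pair of positions, which illustrates why a computational pre-screen is attractive before any experimental testing.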

    Modelling the interaction of steroid receptors with endocrine disrupting chemicals

    BACKGROUND: Organic polychlorinated compounds such as dichlorodiphenyltrichloroethane (with its metabolites) and polychlorinated biphenyls are a class of highly persistent environmental contaminants. They have been recognized to have detrimental health effects on both wildlife and humans, acting as endocrine disrupters due to their ability to mimic the action of the steroid hormones and thus interfere with hormone response. Several lines of experimental evidence indicate that they bind and activate human steroid receptors. However, despite the growing concern about the toxicological activity of endocrine disrupters, molecular data on the interaction of these compounds with biological targets are still lacking. RESULTS: We have used a flexible docking approach to characterize the molecular interaction of seven endocrine disrupting chemicals (EDCs) with estrogen, progesterone and androgen receptors in the ligand-binding domain. All ligands docked in the buried hydrophobic cavity corresponding to the steroid hormone pocket. The interaction was characterized by multiple hydrophobic contacts involving a different number of residues facing the binding pocket, depending on ligand orientation. The EDC ligands did not display a unique binding mode, probably due to their lipophilicity and flexibility, which gave them great adaptability within the large hydrophobic binding pocket of steroid receptors. CONCLUSION: Our results are in agreement with toxicological data on binding and allow us to describe a pattern of interactions between a group of EDCs and steroid receptors, suggesting the requirement of a hydrophobic cavity to accommodate these chlorine-carrying compounds. Although the affinity is lower than for hormones, their action can be brought about by a possible synergistic effect.

    Framing Apache Spark in life sciences

    Advances in high-throughput and digital technologies have required the adoption of big data for handling complex tasks in life sciences. However, the shift to big data has led researchers to face technical and infrastructural challenges in storing, sharing, and analysing these data. In fact, such tasks require distributed computing systems and algorithms able to ensure efficient processing. Cutting-edge distributed programming frameworks make it possible to implement flexible algorithms able to adapt the computation to the data over on-premise HPC clusters or cloud architectures. In this context, Apache Spark is a very powerful HPC engine for large-scale data processing on clusters. Thanks also to specialised libraries for working with structured and relational data, it supports machine learning, graph-based computation, and stream processing. This review article is aimed at helping life sciences researchers to ascertain the features of Apache Spark and to assess whether it can be successfully used in their research activities.

    The Genome Conformation As an Integrator of Multi-Omic Data: The Example of Damage Spreading in Cancer.

    Publicly available multi-omic databases, in particular if associated with medical annotations, are rich resources with the potential to lead a rapid transition from high-throughput molecular biology experiments to better clinical outcomes for patients. In this work, we propose a model for multi-omic data integration (i.e., genetic variations, gene expression, genome conformation, and epigenetic patterns), which exploits a multi-layer network approach to analyse, visualize, and obtain insights from such biological information, so that the results can be used at a macroscopic level. Using this representation, we can describe how driver and passenger mutations accumulate during the development of diseases, providing, for example, a tool able to characterize the evolution of cancer. Indeed, our test case concerns the MCF-7 breast cancer cell line, before and after stimulation with estrogen, since many datasets are available for this case study. In particular, the integration of data about cancer mutations, gene functional annotations, genome conformation, epigenetic patterns, gene expression, and metabolic pathways in our multi-layer representation will allow a better interpretation of the mechanisms behind a complex disease such as cancer. Thanks to this multi-layer approach, we focus on the interplay of chromatin conformation and cancer mutations in different pathways, such as metabolic processes, that are very important for tumor development. Working on this model, a variance analysis can be implemented to identify normal variations within each omics layer and to characterize, by contrast, variations that can be attributed to pathological samples compared to normal ones. This integrative model can be used to identify novel biomarkers and to provide innovative omic-based guidelines for treating many diseases, improving the efficacy of decision trees currently used in the clinic.

    Solving biclustering with a GRASP-like metaheuristic: two case-study on gene expression analysis

    The explosion of "omics" data over the past few decades has generated an increasing need to efficiently analyze high-dimensional gene expression data in several different and heterogeneous contexts, such as information retrieval, knowledge discovery, and data mining. For this reason, biclustering, or simultaneous clustering of both genes and conditions, has generated considerable interest over the past few decades. Unfortunately, the problem of locating the most significant bicluster has been shown to be NP-complete. We have designed and implemented a GRASP-like heuristic algorithm to efficiently find good solutions in reasonable running times, and to overcome the inherent intractability of the problem from a computational point of view. Experimental results on two datasets of expression data are promising, indicating that this algorithm is able to find significant biclusters, especially from a biological point of view.

    An Unusual Distribution of 6-nt Sequences Near The Transcription Start Site

    Summary: A new look at the transcription start is presented in which we can see transcription factors binding to both sides of the TSS as an essential requirement. Naturally, the factor binding to the downstream region must be removed so that the transcription process can continue. The presence of a number of distinct transcription factors can also be used to explain selective activation of various genes. The transcription start site by itself plays only a minor role in the whole process. We also suggest that mutations close to the TSS on the coding side can be fatal even if they preserve the codon table.